1. Introduction


An  automated  technique  has  been  developed  and  evaluated  to  reconstruct  3-D  binary  images  of 
breast  calcifications.  The  reconstruction  algorithm  consists  of  segmentation,  motion  correction, 
correlation  between  views,  3-D  binary  limited-view  reconstruction  of  each  calcification,  and  3-D 
rendering.1  This  method  relied  upon  significant  human  intervention  and  judgment  in 
producing  the  final  3-D  image.  In  this  grant,  we  sought  methods  to  automate  these  tasks. 
Required  were  robust  methods  of  identifying,  segmenting  and  correlating  (or  pairing) 
calcifications  between  views. 

The  tasks  of  identifying  and  segmenting  calcifications  have  been  attempted  on  numerous 
occasions.  These  previous  attempts  have  been  used  almost  exclusively  in  computer  aided 
diagnosis  (CAD)  systems.  In  such  systems,  the  desire  is  to  capture  a  sufficient  number  of 
calcifications  to  identify  clusters  of  suspicious  calcifications  for  evaluation  by  a  human 
observer. Significant effort is expended on eliminating false positives. By imaging the
breast  at  3  separate  angles  with  known  spatial  alignment,  we  have  the  advantage  that  true 
calcifications  are  present  in  each  of  the  images,  while  most  spurious  or  non-calcified 
signals  are  found  in  only  one  image.  This  additional  constraint  allows  us  to  segment 
more  calcifications  in  each  image,  admittedly  with  a  high  false  positive  rate.  The  task  of 
correlation  between  the  images  quite  naturally  reduces  the  false  positive  rate  in  the  3-D 
image. 

The work to date is reviewed in this annual report.

2. Body

2.1.  Summary  of  Work  Items 

It  is  useful  to  restate  the  work  items  listed  in  the  original  grant.  They  are  as  follows: 

Task  1:  Compile  database  of  50  selected  cases  (Months  1-2) 

Task  2:  Manually  identify  and  pair  calcifications  in  database  images  (Months  3-4) 

Task 3: Evaluate methods for automated identification, segmentation and correlation of
calcifications  (Months  1-24) 

Task  4:  Apply  reconstruction  technique  to  non-calcified  structures  (Months  25-36) 

At  the  current  time,  tasks  1  &  2  are  complete  and  task  3  is  essentially  complete.  A  decision  must 
soon be made as to whether the remaining time should be used to refine the results of tasks 1-3 or
to  undertake  task  4.  It  is  the  preference  of  the  PI  to  perform  this  refinement.  However,  the 
opinion  of  the  DOD  Program  Director  will  be  sought  prior  to  any  change  in  work  items.  In  the 
following  report,  a  discussion  of  the  accomplishments  for  the  period  of  October  1,  1997  until 
March,  2001  will  be  provided.  Due  to  extensions  and  delays  incurred  during  the  performance  of 
this  grant,  a  detailed  summary  of  the  timeline  of  events  affecting  this  grant  will  also  be  offered. 

2.2.  Database  Formation 

We  were  able  to  obtain  image  data  on  130  women  from  two  different  prior  studies.  Both  studies 
were conducted under IRB review. The first (TJU IRB control #93.0705) consisted of a
retrospective  review  of  images  from  74  patients  who  had  had  a  stereotactic  core  biopsy  for  breast 
calcifications,  or  of  breast  tissue  specimens  containing  calcifications.  In  the  latter  case,  the 
specimens  had  been  imaged  in  a  water  bath  to  simulate  breast  tissue  of  equal  thickness  to  a 
normal  breast.  However,  as  we  shall  discuss  below,  these  latter  images  lack  the  complexity  of 
background  structures  that  are  found  in  real  mammograms.  The  second  study  (TJU  IRB  control 
#96.0160)  was  a  prospective  study  of  women  having  core  breast  biopsies.  The  DOD  funded  this 
latter  study.1 

For each patient, three images of the breast were obtained at ±15° of separation. Of the patients'
images in the database, 10 were of specimens, while the remainder were acquired in vivo. Of the
130 cases, 42 were malignant (32.3%). A summary of patient race is given in Table 1. The
racial distribution is similar to our patient population as a whole.


Table 1: Summary of Patient Race of the Image Database

IRB Control #   White   Black   [illegible]   Unknown   Total
93.0705            59      11            1*        3*      74
96.0160            43       6            1         6       56
Total             102      17            2*        9      130

* Illegible in the source; value inferred from the row and column totals.


Figure 1 Comparison of the number of calcifications that
were manually segmented from each view of the same patient (axis label: Image One).

2.3.  Manual  Evaluation  of  Images 

Two  skilled  observers  evaluated  the  images  in  the  database.  The  first  observer  performed  manual 
segmentation  and  correlation  of  47  images.  The  second  observer  manually  segmented  and 
correlated calcifications in 110 cases. Later, that same observer re-evaluated the same images
and  segmented  all  visible  calcifications  that  could  not  be  paired  manually.  Note  that  20  cases 
were  eliminated  for  a  number  of  reasons,  including  withdrawal  from  the  study,  insufficient 
numbers  of  radiographically  visible  calcifications,  and  lack  of  radiographically  visible 
calcifications. 

The consistency of the image data was tested by comparing the number of calcifications
segmented by a human observer in two views of each of 110 cases. In this experiment, all visible
calcifications were segmented, not just those that could be paired. Figure 1 shows the number of
segmented calcifications in two images for each case. Four clusters had in excess of 70
calcifications. The linear correlation coefficient is 0.848. On average, 15.9 calcifications were
identified per image, and 8.7 calcifications were paired per image (55%).

A  comparison  of  the  reconstructed  images  of  47  cases,  generated  by  the  two  different  operators, 
was  used  to  assess  the  inter-operator  variability  of  the  reconstruction  method,  and  to  determine 
the  validity  of  the  manually  segmented  image  data.  Shown  in  figure  2  is  the  correlation  between 
the  two  operators  for  the  number  of  segmented  pairs  of  calcifications.  Agreement  in  segmenting 
individual  calcifications  occurred  in  81%  of  calcifications,  and  agreement  of  pairing  calcifications 
occurred in 70% of pairs. The linear correlation coefficient of the plotted data is 0.993.

Figure 2 Interoperator variability of segmentation and correlation of calcifications is
shown, by comparing the number of paired calcifications found in each image (axis label: Observer One).

In summary, a data set of 130 cases was obtained. For the purposes of testing the algorithms for
identification, segmentation and correlation of calcifications, 110 patients' images were used. It
should be noted that in some of the work discussed below, we also found it necessary to eliminate
the 10 cases that consisted of specimen images. In these instances, the calcifications were too
obvious. This is due mainly to the lack of overlying tissues that would otherwise confound
discovery and reduce conspicuity of the calcifications.

2.4.  Automated  Identification  and  Segmentation 

The algorithm first proposed in the grant application was found to be inadequate. The initially
proposed method involved wavelet processing of the images. Unfortunately, the
algorithm was not intuitive; it relied upon operational parameters that had no physical
basis. As a result, while it was capable of segmenting large numbers of calcifications, it could not
be easily tuned, and it was similarly difficult to differentiate those objects it had segmented. For
this reason, we rethought the segmentation approach and developed a new algorithm. This
algorithm has the benefit of having a physical basis. It also has the advantage that it can be tuned to
aggressively segment potential calcified regions. Thus, we can rely upon the correlation step to
distinguish between artifactual and actual calcifications.

The algorithm proceeds as follows. First, large-scale trends are removed from the image using an
unsharp mask with a 31x31 kernel. Next, spurious signals are eliminated by performing
statistical analyses of 5x5 neighborhoods and eliminating points that differ from the
neighborhood mean by more than 3 times the population variance. A Laplacian operator is
applied on 7x7 regions, and then calcifications are selected based upon a local threshold on these data.

The  segmentation  is  based  upon  a  heuristic  illustrated  below  in  figures  3  &  4.  Rather  than 
compute  the  number  of  connected  regions  as  a  function  of  threshold,  we  calculate  a  connectivity 
factor based upon Euler’s formula

C(t) = F(t) - E(t) + V(t)


where  V(t)  is  the  number  of  sample  points  (vertices)  below  the  threshold,  E(t)  is  the  number  of 
pairs  of  vertically  or  horizontally  adjacent  points  (edges),  and  F(t)  is  the  number  of  groups  of  4 
pixels  (faces)  in  the  form  of  a  square.  We  have  found  experimentally  that  a  threshold  of  25%  of 
the  maximum  connectivity  value  is  optimal. 
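As a minimal sketch of the connectivity factor defined above, the following Python counts vertices, edges and faces directly on a small image stored as a list of rows. The function names, the list of candidate thresholds, and the choice to take the first threshold at which C(t) reaches 25% of its maximum (the text does not say which side of the peak is meant) are assumptions made for illustration; this is not the implementation used in this work.

```python
def connectivity(image, t):
    """Connectivity factor C(t) = F(t) - E(t) + V(t) for pixels below threshold t."""
    rows, cols = len(image), len(image[0])
    below = [[image[m][n] < t for n in range(cols)] for m in range(rows)]
    v = e = f = 0
    for m in range(rows):
        for n in range(cols):
            if not below[m][n]:
                continue
            v += 1                                   # vertex: one pixel below t
            right = n + 1 < cols and below[m][n + 1]
            down = m + 1 < rows and below[m + 1][n]
            e += right + down                        # edges to the right and downward
            if right and down and below[m + 1][n + 1]:
                f += 1                               # face: full 2x2 square below t
    return f - e + v


def pick_threshold(image, thresholds, fraction=0.25):
    """Assumed rule: first threshold whose C(t) reaches `fraction` of the maximum."""
    values = [connectivity(image, t) for t in thresholds]
    target = fraction * max(values)
    for t, c in zip(thresholds, values):
        if c >= target:
            return t
    return thresholds[-1]
```

Because each pixel only inspects its right and lower neighbours, one pass over the image gives C(t) without labelling connected regions, which is the practical appeal of the Euler-based factor.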


Figure 3 An example of the connectivity calculation applied to one of the images from
the database, plotted against threshold (arbitrary units). The optimal threshold occurs at 25% of the maximum connectivity value.


Figure 4 An example of how connectivity varies as a function of threshold. The leftmost image has a threshold which
is too low, the middle is appropriate (this threshold is determined by the point on the connectivity graph where the
connectivity factor is 25% of the maximum value), and the rightmost is too high.

We next eliminate obvious false positives. Calcifications within 20 pixels of the edge of the
image are eliminated, as are those containing fewer than 4 pixels (100 µm²). We also test
calcifications in terms of contrast. By defining a region extending 7 pixels beyond the edge of the
segmented structure, we can calculate a signal-to-noise ratio. If this ratio is less than 5, then the
calcification is eliminated. Finally, calcifications associated with long linear structures, such as
blood vessels, are eliminated. An example of the complete method is illustrated in figure 5.
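The first three rejection tests above can be sketched as follows. The text does not define the signal-to-noise ratio or the exact shape of the surrounding region, so this sketch assumes a rectangular region (the candidate's bounding box grown by 7 pixels) and takes the ratio as the absolute difference between the candidate mean and the background mean, divided by the background standard deviation. The function name and argument layout are hypothetical, and the test for long linear structures is omitted.

```python
from statistics import mean, pstdev


def keep_candidate(image, pixels, edge=20, min_pixels=4, margin=7, min_snr=5.0):
    """Return True if a segmented candidate survives the simple rejection tests.

    image  : 2-D list of pixel values
    pixels : list of (row, col) tuples belonging to the candidate
    """
    rows, cols = len(image), len(image[0])
    if len(pixels) < min_pixels:
        return False                                  # too few pixels
    ms = [m for m, _ in pixels]
    ns = [n for _, n in pixels]
    if (min(ms) < edge or min(ns) < edge or
            max(ms) >= rows - edge or max(ns) >= cols - edge):
        return False                                  # too close to the image edge
    inside = set(pixels)
    # Assumed background region: bounding box grown by `margin`, minus the candidate.
    background = [image[m][n]
                  for m in range(max(0, min(ms) - margin), min(rows, max(ms) + margin + 1))
                  for n in range(max(0, min(ns) - margin), min(cols, max(ns) + margin + 1))
                  if (m, n) not in inside]
    signal = abs(mean(image[m][n] for m, n in pixels) - mean(background))
    noise = pstdev(background)
    if noise == 0:
        return signal > 0                             # flat background: any contrast passes
    return signal / noise >= min_snr                  # assumed SNR definition
```

Keeping each test as an early return makes it easy to loosen or tighten one criterion at a time, in keeping with the aim of segmenting aggressively and leaving the final rejection to the correlation step.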


Figure 5 An example of the complete segmentation process: (upper left) an image after large-scale trends and
spurious signals have been eliminated, (upper right) the same image after the 7x7 Laplacian operator is
applied, (lower left) the image obtained by applying the 25% threshold, at which point many false positive signals still
exist, and (lower right) the final output of the segmentation process.

Shown in figure 6 is the number of calcifications that were automatically segmented in each
image compared to the two other images in each patient’s dataset. On average, 57.7 calcifications
were segmented per image (min 16, max 203) in an analysis of 109 cases (327 images). The
linear correlation coefficient is 0.777. There is no correlation between the number of
calcifications seen and the specific viewing angle. The average number of calcifications seen
was 56.3, 57.8 and 59.0 for the three different views. This number is approximately 4 times that
found by the human observer. However, a significant number are false positives.

Figure 6 Correlation of the number of calcifications segmented between any two of
the three views used to reconstruct the calcifications

A scatter graph of the minima of the number of calcifications segmented in the three 2-D source
images for each of 110 cases and the number of calcifications paired in 3-D from those views is
shown in figure 7. The minimum of the number of calcifications in the 3 images was on average
49.1, as compared to an average number of calcifications of 57.7. Of these, on average 8.78 were
paired (16.3%). The data can be fit to a line of the form y = 0.346 x - 8.18. The correlation
coefficient is 0.772, and the x-intercept is 23.6.
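As a quick consistency check on the fitted line (a worked calculation added for illustration, using only the coefficients reported above), setting y = 0 gives the quoted x-intercept, and evaluating the line at the mean minimum count of 49.1 gives about 8.8 pairs, in line with the reported mean of 8.78:

```latex
y = 0.346\,x - 8.18
\quad\Rightarrow\quad
x_0 = \frac{8.18}{0.346} \approx 23.6,
\qquad
y(49.1) = 0.346 \times 49.1 - 8.18 \approx 8.81
```

One possible reading of the intercept, offered tentatively, is that roughly 24 candidates must be segmented in the sparsest view before the fit predicts any successful triplet.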

The number of calcifications correlated by the algorithm matches the number paired by the
human observer. However, there are a number of differences that should be noted. First, there
was agreement between the human and the machine in about half of the cases. This is not, in and of
itself, a reason for concern, as there are a number of differences in the approaches used. For
instance, the machine matched calcifications in 3 views, while the human only matched
calcifications in 2 views. There were definitely calcifications that were only seen in 2 of 3 views.
Methods of improving these results are discussed in Section 5.6.

Figure 7 Correlation between the minimum number of calcifications segmented in the three
source images and the number of calcifications paired for the same case

Automated  Correlation 

As  shown  in  figure  8,  a  calcification  falls  on  a  line  between  the  image  of  the  calcification  and  the 
x-ray  focal  spot.  Given  two  views,  their  intersection  should  give  the  location  of  the  calcification; 
a  third  view  is  redundant.  Realistically,  the  lines  are  skew  due  to  patient  motion,  the  finite  size  of 
the  image,  and  uncertainty  in  the  acquisition  geometry. 


Figure 8 Geometry of image acquisition, showing the general problem of
imaging a calcification from 2 views.

Figure 9 An example of the goodness of fit calculation for a set of displacements in two of the three
images (axis label: Goodness of Fit). Such calculations were performed to account for patient motion and the imprecision of the
motion of the x-ray tube arm. The goodness of fit criterion is determined from the number of calcification
triplets that can be paired with sufficient proximity.

We  simultaneously  solve  the  correspondence  problem  and  the  geometry  problem.  We  assume 
that  geometric  uncertainty  can  be  compensated  for  by  shifts  of  the  second  and  third  images.  For 
each  shift,  a  count  is  made  of  the  number  of  triples  whose  positions  are  consistent  with  being  in 
the  shadow  of  a  single  calcification.  The  parameter  space  of  shifts  is  searched  to  maximize  the 
count. 

For each possible triplet, the calcification position is found by minimizing the sum of the squares
of the distances to the three lines. The sum, χ², for each possible triplet is sorted in ascending
order, such that each calcification shadow is used uniquely. The sum of 1/(1 + (χ²/σ²)) is
calculated over these triplets. The result is 1 if the three shadows are perfectly aligned and
becomes 0 if the lines are highly skew. The value of σ² is related to the distance between the
best-fit point and the lines for shadows which are consistent. For this work σ² is such that the
distance of best fit is 50 µm. An example of the fitting process (in a restricted 2-D subspace) is
shown in figure 9.
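The triplet fit described above can be sketched in plain Python. Each line is given by a point on it and a direction (for example a shadow position and the direction toward the focal spot for that view); the best-fit point solves the 3x3 normal equations for the sum of squared point-to-line distances. All function names are hypothetical, the greedy acceptance in ascending order of χ² is one reading of "used uniquely", and the weight is taken here as 1/(1 + χ²/σ²), which should be treated as an assumption about the exact form. The search over image shifts is not shown.

```python
import math


def _unit(v):
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def _solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x


def best_fit_point(lines):
    """lines: list of (point, direction) triples of floats. Returns (x, chi2)."""
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    units = [(p, _unit(d)) for p, d in lines]
    for p, u in units:
        for i in range(3):
            for j in range(3):
                proj = (1.0 if i == j else 0.0) - u[i] * u[j]   # I - u u^T
                a[i][j] += proj
                b[i] += proj * p[j]
    x = _solve3(a, b)
    chi2 = 0.0
    for p, u in units:
        diff = [x[i] - p[i] for i in range(3)]
        along = sum(diff[i] * u[i] for i in range(3))
        chi2 += sum(c * c for c in diff) - along * along        # squared distance to line
    return x, chi2


def select_unique_triplets(candidates):
    """candidates: (chi2, i, j, k) with shadow indices in views 1, 2 and 3."""
    used = (set(), set(), set())
    chosen = []
    for chi2, i, j, k in sorted(candidates):
        if i in used[0] or j in used[1] or k in used[2]:
            continue                                            # a shadow is already taken
        used[0].add(i)
        used[1].add(j)
        used[2].add(k)
        chosen.append((chi2, i, j, k))
    return chosen


def goodness_of_fit(chi2_values, sigma2):
    """Assumed weighting: each accepted triplet contributes 1 / (1 + chi2 / sigma2)."""
    return sum(1.0 / (1.0 + chi2 / sigma2) for chi2 in chi2_values)
```

In a full search, this score would be recomputed for every trial shift of the second and third images, and the shift giving the largest score retained.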

We  are  not  aware  of  a  previous  attempt  to  use  such  a  metric  to  correct  for  motion  or  determine 
correspondence  in  similar  algorithms  applied  to  computer-based  vision.  This  is  one  of  the  most 
significant pieces of new work in this grant, and is being written up as a scientific paper.

2.5. Discussion and Summary of Scientific Results

In our work to date, we have generated a manually segmented and paired dataset of 110
patients' images, which we have used as a “gold standard” in the evaluation of computer
algorithms for identifying, segmenting and correlating calcifications. We have been able
to develop two separate computer algorithms. Both are quite robust. There are a number
of significant findings from this work that will be published. First, the use of Euler's
number to determine connectivity in an automated fashion is unique. Secondly, the
simultaneous  correction  of  patient  motion  and  the  determination  of  correspondence 
between  the  views  is  unique,  and  will  be  published.  At  this  time,  the  algorithms  are  as 
good  as  the  human  observer.  However,  it  now  appears  that  the  algorithm  could  be 
relatively  easily  improved.  Similarly,  the  comparison  to  the  human  observer  can  be 
improved. 

There  are  two  potential  flaws  with  the  existing  work.  First,  the  segmentation  and  pairing 
by  the  human  observer  could  be  better.  We  have  found  that  it  is  significantly  easier  to 
identify  and  pair  calcifications  in  images  that  have  had  all  low-frequency  details 
removed.  This  is  essentially  the  first  step  of  our  automated  process.  We  propose  to 
reanalyze  the  images  by  a  human  observer  following  processing  of  the  images  with  a 
31x31  unsharp  mask.  This  step  would  make  the  search  task  easier,  and  hence  a  larger 
number  of  calcifications  would  be  found,  and  a  larger  number  could  be  paired. 

The  second  flaw  is  with  the  correlation  step.  Currently,  we  only  correlate  findings  seen 
in  all  three  images.  We  now  realize  that  there  are  calcifications  that  are  only  visible  in 
two  images.  We  would  like  to  use  the  motion  correction  that  is  determined  in  the  first 
step  of  3-view  correlation  to  then  be  applied  to  2-view  correlations  between  the  3 
possible  pairs  of  images.  We  believe  that  this  will  result  in  more  calcifications  being 
correlated  between  views  by  the  computer.  We  also  believe  that  this  will  result  in  better 
agreement  between  the  computer  and  the  human  observer. 

The original intent of task 4 was to add features to the image that would add perspective
to the calcification images. At this time, it is not clear what those features would be.
Faced with the choice between proceeding with work item #4 and refining the work to
date for items 1-3, I, as the PI, would prefer to continue to work on the refinement. I
believe  that  this  is  the  best  use  of  resources  and  the  most  likely  to  have  a  beneficial  effect 
in  the  long  term.  The  3-D  images  produced  by  this  method  are  best  when  many 
calcifications  are  segmented  and  correlated.  Expending  effort  on  maximizing  this 
number is better than tackling new and unknown problems.

2.6. Discussion of Administrative Issues

Significant  progress  has  occurred  with  this  grant,  and  a  successful  conclusion  is  likely  in 
the  near  future.  However,  it  is  useful  to  discuss  the  timeline  and  events  that  have  shaped 
the  life  of  this  research  project.  When  first  proposed,  the  work  was  to  be  performed  by 
Andrew  Maidment  (PI),  Michael  Albert  (research  assistant),  and  Emily  Conant  (clinical 
collaborator).  Two  additional  radiologists,  and  a  pathologist  were  included  to  provide 
additional  clinical  assistance  when  necessary.  Prior  to  beginning  the  grant  Dr.  Conant 
left  Thomas  Jefferson  University.  This  was  a  major  loss  as  Dr.  Conant,  among  all  of  the 
radiologists  at  Thomas  Jefferson  University  Hospital,  was  most  familiar  with  the  project 
and  had  previously  contributed  most  to  the  project.  We  have  never  since  had  a 
radiologist  with  her  clarity  of  insight  into  this  clinical  problem. 

Next, since the departure of Dr. Conant, we have had a number of additional people start
and leave the TJUH breast center, including Dr. Stephen Feig, Dr. Dionne Farria, Dr.
Jane Hughes, Dr. Stephen Lee, Dr. Steven Nussbaum, Dr. Barbara Cavanough, and
others. This has had a negative impact on all of the research being performed at the
breast center. Dr. Catherine Piccoli assumed Dr. Conant’s responsibilities, as was
communicated to the DOD. In addition, Dr. Feig and Dr. Farria did work on this grant
briefly.

As  related  in  Dr.  Maidment’s  letter  dated  June  3,  1999,  a  waiver  of  the  1998  report  and  a 
one-year  no  cost  extension  were  requested  due  to  the  clinical  load  of  Dr.  Maidment  and 
Dr.  Albert.  Prior  to  that  date,  Drs.  Maidment  and  Albert  were  expending  essentially 
100%  of  their  time  in  support  of  clinical  medical  physics  and  PACS  at  Jefferson.  Prior  to 
this  time,  no  salary  was  drawn  from  the  grant.  The  department  acknowledged  these 
issues,  in  part,  and  hired  2  people  to  assume  some  of  Dr.  Maidment’s  and  Dr.  Albert’s 
clinical responsibilities. In neither case has this replacement been complete. As a result,
Dr.  Maidment  still  spends  at  least  40%  of  his  time  clinically,  and  Dr.  Albert  spends  at 
least  30%  of  his  time  clinically.  Thus,  while  significant  progress  has  occurred  since  June 
of  1999,  the  work  is  not  complete.  As  such,  task  4  has  not  yet  been  started. 

At  the  same  time,  our  progress  in  Tasks  1-3  has  given  us  additional  insight  that  we  did 
not  previously  possess.  For  example,  we  now  know  which  types  of  cases  are  best  suited 
to  this  type  of  image  reconstruction.  We  have  a  better  understanding  of  what  makes  the 
3-D  rendering  useful  to  different  doctors  (partly  from  our  work  in  the  related  grant1).  We 
are  now  at  a  point  where  given  a  seed  location  for  a  calcification,  we  can  virtually  always 
segment  it.  We  can  also  determine  correspondence  with  better  accuracy  using  the 
computer  than  with  humans  (a  retrospective  analysis  of  computer  generated 
correspondences  actually  made  us  question  the  work  of  the  human  observer).  We  also 
know  that  the  correlation  would  be  improved  if  we  could  pair  calcifications  in  2  views 
(there  are  3  sets  of  pairs  we  can  use  in  each  case).  We  would  only  do  this  last  step  after 
obtaining  all  possible  triplets. 

For  this  reason,  we  would  like  to  recommend  the  following.  We  request  the  input  of  the 
program  office  of  the  DOD.  First,  we  would  like  to  extend  the  completion  date  of  the 
grant by an additional year. Although we will not need the full year to complete the
work, it will allow us additional time to prepare this work for publication. A formal
request for the extension will be sent to the DOD shortly.

Secondly,  the  exciting  results  to  date  have  left  us  highly  motivated  to  proceed,  in  this  last 
year  of  the  grant,  to  refine  the  work  in  tasks  1-3,  rather  than  proceed  with  Task  4. 

Namely, we wish to reanalyze the images manually, first applying some of the image pre-processing
steps from the automated analysis. This, we have observed, makes the
calcifications  easier  to  find,  as  they  are  more  obvious  in  their  appearance.  We  also  wish 
to  add  pair-wise  calcification  correlation  after  having  searched  for  triplets.  This 
overcomes  the  problem  that  some  calcifications  are  only  seen  in  two  of  three  images.  By 
performing  the  triplet  analysis  first,  we  can  apply  motion  corrections  prior  to  pairing. 
Finally,  we  would  like  to  test  the  correlation  analysis  differently.  We  will  do  this  by 
having  human  observers  judge  the  accuracy  of  each  computer  generated  pairing  on  a 
multipoint  scale.  This  will  better  allow  us  to  determine  the  performance  of  the  algorithm. 
Again,  we  would  appreciate  the  opinion  of  DOD  in  this  matter,  and  seek  your  permission 
to alter the proposed work as outlined above.

3. Key Research Accomplishments

The  following  is  a  list  of  key  research  accomplishments  resulting  from  this  work: 

• Developed a database of 110 biopsy-proven cases, with 3 digital images of each case

• Developed a set of segmented images from each of 110 cases. In these data, each
calcification from all three views of each patient was manually identified, and
semi-automatically segmented.

•  Developed  a  set  of  manually  determined  correspondences. 

• These datasets were used to develop an automatic identification and segmentation
algorithm that tested each point in an image as a potential seed point and then tested each
resultant segmented region for validity as a potential calcification. A key feature of this
algorithm was the use of Euler’s number to determine connectivity.

•  The  above  datasets  were  also  used  to  develop  an  automatic  correspondence  algorithm. 
The  algorithm  used  a  weighted  summation  that  allowed  us  to  simultaneously  correct  for 
patient motion and determine optimal correspondence.

5. Conclusions

In conclusion, we have developed automated algorithms for identifying, segmenting, and
correlating calcifications in 3-D, using 3 source images acquired at 15 degree
increments.  The  algorithms  have  been  tested  with  previously  acquired  clinical  data, 
which  was  arranged  into  a  database,  and  was  analyzed  by  human  observers  for  the 
purpose  of  developing  a  gold  standard  for  the  reconstructions.  The  algorithms  have 
worked  very  well.  The  use  of  Euler’s  formula  for  connectivity  analysis  and  the 
simultaneous  correction  of  image  correlation  and  image  motion  are  particularly 
noteworthy  accomplishments.  Further  work  remains,  and  will  be  performed  in  the  next 
year. 

With regard to the choice of future work, we have nearly completed the work originally
proposed  in  Tasks  1-3.  Task  4  is  likely  to  prove  difficult,  and  from  the  insight  gained 
over  the  last  few  years,  we  believe  that  our  effort  would  best  be  spent  on  further 
improving  the  algorithms,  and  on  refining  the  metrics  used  to  calculate  the  accuracy  of 
the algorithms.

7. Appendices

Attached to this report is a preprint of the paper written for the presentation by A.D.A.
Maidment and M. Albert, entitled "Automated Reconstruction of 3-D Calcifications".
This work was presented at the 5th International Workshop on Digital Mammography in
Toronto, Canada on June 14, 2000.

1. Introduction

Conventional  mammography  fails,  in  part,  due  to  the  processes  of  projection  and  superposition. 
These  processes  occur  whenever  a  2-dimensional  image  is  produced  from  a  3-dimensional 
object.  We  have  attempted  to  overcome  this  failing  by  producing  3-D  images  of  breast 
calcifications to be used in the determination of malignancy. The rationale for this procedure
has been discussed previously, as has a preliminary manual methodology.1,2 The results were
sufficiently  promising  to  justify  additional  work  in  automating  the  technique. 

3-D  images  are  generated  using  a  limited-view  binary  reconstruction  algorithm,  with  source 
images  acquired  from  a  stereotactic  digital  mammography  system  (typically,  three  images 
separated  by  15°  each).  The  initial  method  involved  manually  identifying  calcification  pairs 
between  two  images  and  subsequently  reconstructing  each  calcification  in  3-D  from  these 
projection  data.  In  the  automated  method,  candidate  calcifications  are  determined  in  each  view, 
then  the  correspondence  between  views  is  determined,  and  finally  a  reconstruction  algorithm  is 
applied to each calcification. In this paper, the segmentation and correspondence algorithms are
described. These differ from the work of others (particularly those developed for CAD), in that
we are less concerned with false positive segmentations. We can rely upon the correspondence
algorithm to exclude spurious objects that are segmented in only a single view, thereby reducing
false negatives.

2.  Methodology 

The  segmentation  algorithm  begins  by  producing  a  smoothed  version  of  the  image,  formed  by 
taking  the  median  over  31x31  pixel  regions,  and  subtracting  this  from  the  original  image  to 
remove  large  scale  trends.  An  offset  of  2000  is  added  to  the  pixel  values  for  convenience.  Next, 
to  reject  pixels  containing  spurious  signals,  the  mean  and  population  variances  are  computed  in 
5x5  pixel  regions  around  each  point  (excluding  the  pixel  being  tested)  and  if  the  tested  pixel 
differs from the population mean by more than 3 times the population variance, the pixel's value
is  replaced  by  the  population  average.  This  generally  affects  fewer  than  5%  of  the  pixels.  An 
example  is  shown  in  figure  1.  Note  that  in  this  paper,  the  calcifications  are  shown  darker  than 
the  background,  which  is  the  opposite  polarity  from  film. 
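The two preprocessing steps in this paragraph can be sketched in plain Python for a small image stored as a list of rows. The function names and the handling of image borders (windows are simply truncated at the edges) are assumptions for illustration; the outlier test follows the text literally and compares the deviation with 3 times the population variance, not the standard deviation.

```python
from statistics import mean, median, pvariance


def _window(image, m, n, half, skip_center=False):
    """Pixel values in a square window around (m, n), truncated at the borders."""
    rows, cols = len(image), len(image[0])
    return [image[i][j]
            for i in range(max(0, m - half), min(rows, m + half + 1))
            for j in range(max(0, n - half), min(cols, n + half + 1))
            if not (skip_center and i == m and j == n)]


def remove_trends(image, size=31, offset=2000):
    """Subtract the local median (size x size) and add a constant offset."""
    half = size // 2
    return [[image[m][n] - median(_window(image, m, n, half)) + offset
             for n in range(len(image[0]))]
            for m in range(len(image))]


def reject_spurious(image, size=5, factor=3.0):
    """Replace pixels that differ from the neighbourhood mean by more than
    `factor` times the neighbourhood population variance (centre excluded)."""
    half = size // 2
    out = [row[:] for row in image]
    for m in range(len(image)):
        for n in range(len(image[0])):
            neighbours = _window(image, m, n, half, skip_center=True)
            mu = mean(neighbours)
            if abs(image[m][n] - mu) > factor * pvariance(neighbours):
                out[m][n] = mu
    return out
```

The statistics are always computed from the unmodified input image, so the result does not depend on the order in which pixels are visited.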

The  images  are  then  subjected  to  a  Laplacian  on  7x7  pixel  regions,  in  which  the  second-order 
derivatives are estimated in a least-squares manner, shown in figure 2. In the following, I_{m,n} will
represent the processed pixel values shown in figure 1 and L_{m,n} will represent the negative of the
Laplacian as shown in figure 2.

Calcification  candidates  are  selected  based  upon  local  thresholding  of  the  Laplacian  data.  The 
threshold  is  set  locally  based  upon  the  following  algorithm.  Consider  the  segmented  image  as  a 
function  of  threshold.  For  a  very  low  (restrictive)  threshold,  only  the  most  obvious  calcifications 
pass  (figure  3).  As  the  threshold  is  raised,  more  calcification  candidates  are  identified  (figure  4). 
However, if the threshold is raised too high, regions begin to merge and the threshold is no
longer useful for identifying calcifications (figure 5). Heuristically, it is clear that one wants to
set  the  threshold  in  the  intermediate  range  where  there  are  many  small  regions,  but  below  the 
point  at  which  these  regions  merge  into  a  few  large  regions. 

Computing the number of connected regions as a function of threshold is computationally
intensive. Therefore, we calculate a connectivity factor based upon Euler's formula,

C(t)  =  F(t)  -  E(t)  +  V(t)  , 

where  V(t)  is  the  number  of  sample  points  (vertices)  at  which  the  pixel  value  is  less  than  the 
threshold  t,  E(t)  is  the  number  of  pairs  of  horizontally  or  vertically  adjacent  points  (edges)  which 
are below the threshold t, and F(t) is the number of groups of 4 pixels (faces), all below the threshold t, whose coordinates
are of the form {(m,n),(m,n+1),(m+1,n),(m+1,n+1)}. The graph of this connectivity factor as a
function  of  threshold  is  shown  in  figure  6  for  the  same  image  as  in  figure  1.  For  low  values  of 
the  threshold,  no  pixels  pass  so  C(t)=0.  When  only  a  few  pixels  pass,  the  connectivity  factor  is 
approximately  equal  to  the  number  of  connected  regions  that  pass  the  threshold.  The 
connectivity number reaches a maximum, and for sufficiently large values of t is equal to 1,
because  for  large  threshold  t  all  pixels  are  accepted  and  the  entire  region  of  interest  becomes  one 
large  segmented  region.  The  value  of  C(t),  as  defined,  can  become  negative  when  connected 
regions  contain  holes,  and  indeed  the  behavior  shown  in  figure  6  has  been  found  to  be  typical  of 
the  mammographic  images  in  our  database.  The  optimum  threshold  is  set  to  correspond  to  the 
value of t where C(t) is a certain fraction of its maximum. This procedure is appealing because it
provides  a  method  of  setting  the  threshold  in  which  the  identified  candidates  are  independent  of 
any  monotonic  remapping  of  the  pixel  values. 
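
As an illustration of the connectivity factor C(t) = F(t) - E(t) + V(t), the sketch below counts vertices, edges and faces for a single threshold. The function name `connectivity_factor` and the list-of-rows image representation are assumptions. For simplicity it evaluates one threshold per call, rather than accumulating the entire graph in a single pass.

```python
def connectivity_factor(image, t):
    """Return C(t) = F - E + V for pixels whose value is below t."""
    rows, cols = len(image), len(image[0])
    below = [[image[m][n] < t for n in range(cols)] for m in range(rows)]
    vertices = edges = faces = 0
    for m in range(rows):
        for n in range(cols):
            if not below[m][n]:
                continue
            vertices += 1
            # Count each edge once, from its upper or left end.
            right = n + 1 < cols and below[m][n + 1]
            down = m + 1 < rows and below[m + 1][n]
            edges += int(right) + int(down)
            # A face is a 2x2 block with all four pixels below threshold.
            if right and down and below[m + 1][n + 1]:
                faces += 1
    return faces - edges + vertices
```

For an isolated 2x2 block this gives 1 - 4 + 4 = 1, and for a ring of 8 pixels around a single hole it gives 0 - 8 + 8 = 0, consistent with holes lowering the count.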


Various  modifications  of  this  procedure  could  be  considered;  for  example,  using  region  count 
instead  of  the  connectivity  number.  The  connectivity  number  has  the  advantage  that  the  entire 
graph  can  be  calculated  in  a  single  pass  through  the  region  of  interest.  In  any  case,  for  the  values 
of  threshold  t  of  interest,  the  connectivity  factor  C(t)  approximates  the  number  of  connected 
regions.  Experimentally  we  have  found  that  setting  the  threshold  to  a  value  of  t  where  C(t) 
reaches  25%  of  its  maximum  produces  good  results.  The  connectivity  graph  is  relatively  stable 
for  all  regions  in  the  image.  However,  to  take  into  account  local  variations,  the  above  procedure 
is  performed  for  overlapping  250x250  pixel  regions  which  cover  the  image.  Thus,  the  resulting 
set  of  candidates  is  the  union  of  the  pixels  identified  by  application  of  the  procedure  to  each  of 
the  overlapping  regions.  In  figure  7,  the  regions  identified  using  the  25%  threshold  are  shown  in 
red  superimposed  on  the  same  image  as  in  figure  1. 
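
The threshold choice can be sketched as below, given a sampled C(t) curve for one region of interest. The function name `select_threshold` is hypothetical, and the sketch assumes that the intended threshold is the first one (on the rising side of the curve) at which C(t) reaches the chosen fraction of its maximum.

```python
def select_threshold(thresholds, c_values, fraction=0.25):
    """Return the first threshold where C(t) reaches fraction * max C(t).

    thresholds must be in ascending order, with c_values[i] = C(thresholds[i]).
    Returns None if the curve never rises above zero.
    """
    peak = max(c_values)
    if peak <= 0:
        return None
    target = fraction * peak
    for t, c in zip(thresholds, c_values):
        if c >= target:
            return t
    return None
```

Calling it with `fraction=0.35` on the same curve would give the more tolerant threshold of the kind used when excluding pixels from the background fit.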

Each connected region in figure 7 is now considered a calcification candidate, for which
additional  features  are  calculated  to  remove  false  positives.  Any  region  within  20  pixels  of  the 
edge  of  the  image  or  containing  fewer  than  4  pixels  is  removed.  Further  tests  attempt  to  remove 
false  positives  based  upon  the  contrast  of  the  calcification  candidate  relative  to  the  background 
and  the  possible  association  of  the  candidate  with  the  shadow  of  a  larger  structure  within  the 
breast. 
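
The first two rejection rules (edge proximity and minimum size) can be sketched as follows. The names `label_regions` and `keep_region` are hypothetical, and 4-connectivity is an assumption, chosen to match the horizontal and vertical adjacency used for the connectivity factor.

```python
from collections import deque


def label_regions(mask):
    """Return connected regions of a boolean mask as lists of (m, n)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for m in range(rows):
        for n in range(cols):
            if not mask[m][n] or seen[m][n]:
                continue
            seen[m][n] = True
            queue, region = deque([(m, n)]), []
            while queue:  # breadth-first flood fill
                i, j = queue.popleft()
                region.append((i, j))
                for a, b in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                    if (0 <= a < rows and 0 <= b < cols
                            and mask[a][b] and not seen[a][b]):
                        seen[a][b] = True
                        queue.append((a, b))
            regions.append(region)
    return regions


def keep_region(region, rows, cols, margin=20, min_pixels=4):
    """Reject regions that are too small or have a pixel near the edge."""
    if len(region) < min_pixels:
        return False
    return all(margin <= m < rows - margin and margin <= n < cols - margin
               for m, n in region)
```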

Two  measurements  of  local  contrast  are  calculated,  based  upon  a  quadratic  fit  in  a  neighborhood, 
N,  of  the  calcification.  For  the  purposes  of  this  fit,  those  pixels  identified  as  being  in  the 
candidate  calcifications  are  excluded  from  N.  To  provide  a  small  tolerance  in  the  choice  of 
threshold, pixels whose value $L_{m,n}$ falls below the 35% threshold determined from the C(t) graph
are excluded. Except for the pixels thus excluded, the region N is defined as a square extending
7 pixels beyond the candidate calcification in each direction. The goodness of the fit is estimated
by the $\chi^2$ per degree of freedom of the fit

$$\chi^2 = \frac{1}{N-6} \sum \left( I_{m,n} - f(m,n) \right)^2$$

where  the  sum  covers  the  N  points  used  for  the  fit  and  f(m,n)  is  the  quadratic  fit  (which  has  six 
free  parameters).  The  statistical  significance  of  the  signal  can  then  be  represented  by  the  ratio  of 
the value $\chi^2$ calculated for the pixels in the candidate region to the $\chi^2$ per degree of freedom used
in the fit. If this ratio is less than 25 (SNR of 5) the candidate is rejected. Additionally, if the
maximal depth of the candidate region relative to the quadratic fit ($\max(f(m,n) - I_{m,n})$ for (m,n) in
the candidate region) is less than twice the square root of the $\chi^2$ per degree of freedom of the fit,
the  calcification  is  rejected. 
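
The two contrast tests can be sketched as below, taking the quadratic fit as a given callable. The names `chi2_per_dof` and `passes_contrast_tests` are hypothetical. The sketch also assumes that the chi-square value for the candidate is the mean squared residual over the candidate pixels, which the text does not state explicitly.

```python
import math


def chi2_per_dof(image, fit, pixels, n_params=6):
    """Sum of squared residuals over pixels, divided by (N - n_params)."""
    residual = sum((image[m][n] - fit(m, n)) ** 2 for m, n in pixels)
    return residual / (len(pixels) - n_params)


def passes_contrast_tests(image, fit, background, candidate,
                          min_ratio=25.0, min_depth_sigmas=2.0):
    """Apply the significance and depth tests to one candidate."""
    noise = chi2_per_dof(image, fit, background)
    # Calcifications are darker than background, so depth is fit minus pixel.
    depth = max(fit(m, n) - image[m][n] for m, n in candidate)
    if noise <= 0:
        return depth > 0  # perfect fit: any positive depth is significant
    # Assumption: candidate chi-square is the mean squared residual.
    signal = sum((image[m][n] - fit(m, n)) ** 2
                 for m, n in candidate) / len(candidate)
    return (signal / noise >= min_ratio
            and depth >= min_depth_sigmas * math.sqrt(noise))
```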

False  calcification  candidates  with  sufficient  contrast  to  pass  these  tests  are  often  associated  with 
larger  structures  in  the  breast,  as  illustrated  in  figure  8.  To  test  for  association  with  a  possible 
larger  structure,  an  attempt  is  made  to  roughly  segment  these  larger  structures.  Using  a  300x300 
pixel  region,  an  estimate  is  made  of  the  typical  pixel-to-pixel  variation 

$\sigma_{loc}$ = ^ ($I_{m,n}$ - $I_{m,n+1}$) ^ ($I_{m,n}$ - $I_{m+1,n}$) ^ ($I_{m,n}$ - $I_{m+1,n+1}$)

Starting from the calcification, a region is grown to include all pixels whose value $I_{m,n}$ lies $\sigma_{loc}$ below
the  average.  Figure  9  shows  the  approximate  segmentation  of  these  structures  for  the 
calcification  candidates  in  figure  8.  If  the  resulting  region  is  larger  than  1000  pixels,  the  region  is 
rejected.  If  the  maximum  contrast  is  less  than  twice  the  population  variance  of  the  pixel  values 
in  the  larger  region,  then  the  candidate  is  also  rejected  on  the  grounds  that  it  is  really  a 
continuation of that region. Further, if the average value of the pixels inside
the calcification candidate and the average of the pixel values in the larger segmented region
differ by less than the sum of the population variances for these two regions, the calcification is
also rejected. If this difference is less than twice the population variance of the entire 300x300
region, the candidate is also rejected.

Additionally,  some  false  positive  calcification  candidates  are  associated  with  distinctly  linear 
features  in  the  breast  image,  and  these  candidates  can  be  highly  elongated  in  the  direction  of  this 
linear  feature.  To  test  for  this,  the  two  pixels  in  the  calcification  candidate  whose  inter-pixel 
distance is maximal are found. The line running through these two pixels is defined as the long axis of the
calcification.  The  width  of  the  candidate  is  then  determined.  If  the  ratio  of  the  width  to  the 
length  is  less  than  0.25,  we  attempt  to  locate  a  linear  feature  along  the  axis  of  the  calcification. 
First,  the  maximal  distance,  d_max,  of  any  pixel  in  the  candidate  from  the  long  axis  is  found. 
Second,  a  square  region  of  interest  with  side  length  three  times  the  length  of  the  calcification  is 
identified. Three categories of pixels are identified: those between 2 and 4 times d_max on one
side  of  the  axis,  those  within  d_max  of  the  axis,  and  those  between  2  and  4  times  d_max  on  the 
opposite  side  of  the  axis.  Pixels  identified  as  being  inside  calcification  candidates  are  excluded 
from  these  regions.  If  the  average  of  the  regions  on  both  sides  is  greater  than  the  average  of  the 
central  region  by  more  than  5  times  the  population  variances  of  the  regions,  added  in  quadrature, 
this is taken as evidence that the candidate is actually part of a linear feature of the breast, and
the candidate is rejected.
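
The elongation measurement that triggers this test can be sketched as follows. The names `long_axis_and_dmax` and `is_elongated` are hypothetical. Since the text does not say how the width is determined, the sketch assumes it is twice d_max, with distances measured between pixel centres.

```python
import math
from itertools import combinations


def long_axis_and_dmax(pixels):
    """Return (length, d_max) for a candidate with at least two pixels.

    length is the largest inter-pixel distance; d_max is the largest
    perpendicular distance of any pixel from the line joining that pair.
    """
    p, q = max(combinations(pixels, 2),
               key=lambda pair: math.dist(pair[0], pair[1]))
    length = math.dist(p, q)
    ux, uy = (q[0] - p[0]) / length, (q[1] - p[1]) / length
    # Perpendicular distance via the 2-D cross product with the unit axis.
    d_max = max(abs((x - p[0]) * uy - (y - p[1]) * ux) for x, y in pixels)
    return length, d_max


def is_elongated(pixels, max_ratio=0.25):
    """True if width / length is below max_ratio."""
    length, d_max = long_axis_and_dmax(pixels)
    width = 2.0 * d_max  # assumption: width taken as twice d_max
    return width / length < max_ratio
```

The search over all pixel pairs is quadratic in the number of pixels, which is acceptable for the small regions considered here.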

As  shown  in  figure  10,  this  somewhat  ad  hoc  set  of  rules  allows  the  rejection  of  most  of  the  false 
positives that are associated with larger structures in the breast. Figure 11 shows the result of the
complete analysis of the region in figure 1. In this region, only one or two likely calcifications
have been cut, and nearly all of the identified calcifications would be reasonable to a human
observer.

Correspondence  is  determined  geometrically.  In  projection  mammography,  a  calcification  falls 
on  a  line  between  the  image  of  the  calcification  and  the  x-ray  focal  spot.  Given  two  views,  the 
intersection  of  two  lines  should  give  the  location  of  the  calcification;  a  third  view  is  redundant. 
Realistically,  the  three  lines  are  skew,  due  to  patient  motion,  the  finite  size  of  the  image,  and 
uncertainty  in  the  acquisition  geometry. 

We  simultaneously  solve  the  correspondence  problem  and  the  geometry  problem.  We  assume 
that  geometric  uncertainty  can  be  compensated  for  by  shifts  of  the  second  and  third  images.  For 
each  shift,  a  count  is  made  of  the  number  of  triples  whose  positions  are  consistent  with  being  the 
shadow  of  a  single  calcification.  The  parameter  space  of  shifts  is  searched  to  maximize  the 
count.  For  each  possible  triplet,  the  calcification  position  is  found  by  minimizing  the  sum  of  the 
squares of the distances to the three corresponding shadow-to-x-ray-focus lines. The distance, $\delta^2$,
calculated for each possible triplet, is sorted in ascending order so that each calcification image is
used uniquely. The sum of $1/(1+(\delta^2/\Delta^2)^2)$ is calculated over all remaining triplets. Each term is a
quantity which is 1 if the three shadows are in perfect agreement ($\delta^2=0$) and becomes 0 if the
corresponding lines are highly skew (large $\delta^2$). The value of $\Delta^2$ is related to the distance between
the best fit spatial points and the corresponding lines when these shadows are consistent. For
this work $\Delta^2$ is such that the distance of the best fit is 0.005 cm (the pixel pitch of the detector).
An  example  of  this  calculation  is  shown  in  figure  12. 
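
The per-triplet calculation can be sketched as below: the least-squares point closest to three lines in space, the summed squared distance, and the resulting agreement term. All function names are hypothetical. The sketch assumes each line is given by a point and a direction, and that the agreement term has the form 1/(1 + (delta2/Delta2)^2). The search over image shifts and the unique assignment of shadows to triplets are not shown.

```python
def det3(M):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def solve3(A, b):
    """Solve A x = b by Cramer's rule (fails if the lines are all parallel)."""
    D = det3(A)
    return [det3([[b[i] if j == k else A[i][j] for j in range(3)]
                  for i in range(3)]) / D for k in range(3)]


def closest_point_to_lines(lines):
    """lines: iterable of (point, direction) 3-D tuples.

    Returns (x, delta2): the least-squares point and the sum of squared
    perpendicular distances from x to the lines.
    """
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    projected = []
    for a, d in lines:
        norm = sum(c * c for c in d) ** 0.5
        u = [c / norm for c in d]
        # P removes the component along the line, leaving the perpendicular part.
        P = [[(1.0 if i == j else 0.0) - u[i] * u[j] for j in range(3)]
             for i in range(3)]
        projected.append((a, P))
        for i in range(3):
            for j in range(3):
                A[i][j] += P[i][j]
                b[i] += P[i][j] * a[j]
    x = solve3(A, b)
    delta2 = 0.0
    for a, P in projected:
        r = [sum(P[i][j] * (x[j] - a[j]) for j in range(3)) for i in range(3)]
        delta2 += sum(c * c for c in r)
    return x, delta2


def agreement(delta2, Delta2):
    """Agreement term: 1 for perfect agreement, tending to 0 for skew lines."""
    return 1.0 / (1.0 + (delta2 / Delta2) ** 2)
```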


3.  Results and Discussion


The  segmentation  and  correspondence  algorithms  have  proven  to  be  quite  robust.  The 
segmentation  algorithm  results  in  far  more  calcifications  per  image  being  segmented  than  with 
human  observers.  On  average,  the  human  observer  segmented  15.9  calcifications  per  image  in  a 
search  for  "all"  calcifications;  the  current  algorithm  averaged  58  calcifications  per  image. 
However,  as  expected,  the  pairing  algorithm  markedly  reduces  the  number  of  calcification 
candidates  in  the  final  3-D  image.  In  120  cases,  a  human  observer  segmented  and  paired  9.0 
calcifications  per  case.  The  computer  algorithm  paired  8.7  calcifications  per  case.  Currently, 
only  48%  of  pairings  match  those  of  the  human  observer  for  cases  of  8  or  more  calcifications. 
We  have  identified  several  reasons  for  this  discrepancy.  First,  on  review  of  the  images,  some 
pairings  appear  to  be  legitimate  and  missed  by  the  human  observer.  Currently,  we  also  miss 
those  pairings  in  which  calcifications  are  visible  in  only  two  views.  Finally,  not  all  calcifications 
are  being  segmented.  Quantitative  evaluation  of  the  algorithms  is  ongoing,  and  concomitant 
refinements  are  being  pursued.  However,  for  the  first  time,  we  have  been  able  to  robustly  and 
automatically  generate  3-D  images  of  breast  calcifications. 
Figure 7: Calcification candidates for image using threshold in fig. 4

Figure 8: A sample region showing false positive candidates

Figure 9: Approximate segmentation of parenchymal structures

Figure 10: Remaining calcification candidates

Figure 11: Final set of calcified regions from fig. 1

Figure 12: Goodness of fit parameter used for correspondence and motion correction


